AITopics | Wood County

Collaborating Authors

Wood County

Bayesian information theoretic model-averaging stochastic item selection for computer adaptive testing: compromise-free item exposure

arXiv.org Machine LearningApr-21-2025

The goal of Computer Adaptive Testing (CAT) is to reliably estimate an individual's ability as modeled by an item response theory (IRT) instrument using only a subset of the instrument's items. A secondary goal is to vary the items presented across different testing sessions so that the sequence of items does not become overly stereotypical -- we want all items to have an exposure rate sufficiently far from zero. We formulate the optimization problem for CAT in terms of Bayesian information theory, where one chooses the item at each step based on the criterion of the ability model discrepancy -- the statistical distance between the ability estimate at the next step and the full-test ability estimate. This viewpoint of CAT naturally motivates a stochastic selection procedure that equates choosing the next item to sampling from a model-averaging ensemble ability model. Using the NIH Work Disability Functional Assessment Battery (WD-FAB), we evaluate our new methods in comparison to pre-existing methods found in the literature. We find that our stochastic selector has superior properties in terms of both item exposure and test accuracy/efficiency.

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Machine Learning

2504.15543

Country:

North America > United States > Wisconsin > Wood County > Wisconsin Rapids (0.04)
North America > United States > Ohio > Franklin County > Columbus (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

A Rigorous Machine Learning Analysis Pipeline for Biomedical Binary Classification: Application in Pancreatic Cancer Nested Case-control Studies with Implications for Bias Assessments

Urbanowicz, Ryan J., Suri, Pranshu, Cui, Yuhan, Moore, Jason H., Ruth, Karen, Stolzenberg-Solomon, Rachael, Lynch, Shannon M.

arXiv.org Machine LearningSep-8-2020

Machine learning (ML) offers a collection of powerful approaches for detecting and modeling associations, often applied to data having a large number of features and/or complex associations. Currently, there are many tools to facilitate implementing custom ML analyses (e.g. scikit-learn). Interest is also increasing in automated ML packages, which can make it easier for non-experts to apply ML and have the potential to improve model performance. ML permeates most subfields of biomedical research with varying levels of rigor and correct usage. Tremendous opportunities offered by ML are frequently offset by the challenge of assembling comprehensive analysis pipelines, and the ease of ML misuse. In this work we have laid out and assembled a complete, rigorous ML analysis pipeline focused on binary classification (i.e. case/control prediction), and applied this pipeline to both simulated and real world data. At a high level, this 'automated' but customizable pipeline includes a) exploratory analysis, b) data cleaning and transformation, c) feature selection, d) model training with 9 established ML algorithms, each with hyperparameter optimization, and e) thorough evaluation, including appropriate metrics, statistical analyses, and novel visualizations. This pipeline organizes the many subtle complexities of ML pipeline assembly to illustrate best practices to avoid bias and ensure reproducibility. Additionally, this pipeline is the first to compare established ML algorithms to 'ExSTraCS', a rule-based ML algorithm with the unique capability of interpretably modeling heterogeneous patterns of association. While designed to be widely applicable we apply this pipeline to an epidemiological investigation of established and newly identified risk factors for pancreatic cancer to evaluate how different sources of bias might be handled by ML algorithms.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

2008.12829

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Wisconsin > Wood County > Marshfield (0.04)
(10 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
Health & Medicine > Therapeutic Area > Endocrinology (1.00)
Education > Health & Safety > School Nutrition (1.00)
Health & Medicine > Therapeutic Area > Oncology > Pancreatic Cancer (0.72)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Statistical Relational Learning to Predict Primary Myocardial Infarction from Electronic Health Records

Weiss, Jeremy C. (University of Wisconsin-Madison) | Natarajan, Sriraam (Wake Forest University) | Peissig, Peggy L. (Marshfield Clinic Research Foundation) | McCarty, Catherine A. (Essentia Institute of Rural Health) | Page, Daivd (University of Wisconsin-Madison)

AAAI ConferencesJul-21-2012

Electronic health records (EHRs) are an emerging relational domain with large potential to improve clinical outcomes. We apply two statistical relational learning (SRL) algorithms to the task of predicting primary myocardial infarction. We show that one SRL algorithm, relational functional gradient boosting, outperforms propositional learners particularly in the medically-relevant high recall region. We observe that both SRL algorithms predict outcomes better than their propositional analogs and suggest how our methods can augment current epidemiological practices.

algorithm, gradient, risk factor, (16 more...)

AAAI Conferences

Twenty-Fourth IAAI Conference

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > Greenland (0.04)
North America > United States > Wisconsin > Wood County > Marshfield (0.04)
(3 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Demand-Driven Clustering in Relational Domains for Predicting Adverse Drug Events

Davis, Jesse, Costa, Vitor Santos, Peissig, Peggy, Caldwell, Michael, Berg, Elizabeth, Page, David

arXiv.org Artificial IntelligenceJun-27-2012

Learning from electronic medical records (EMR) is challenging due to their relational nature and the uncertain dependence between a patient's past and future health status. Statistical relational learning is a natural fit for analyzing EMRs but is less adept at handling their inherent latent structure, such as connections between related medications or diseases. One way to capture the latent structure is via a relational clustering of objects. We propose a novel approach that, instead of pre-clustering the objects, performs a demand-driven clustering during learning. We evaluate our algorithm on three real-world tasks where the goal is to use EMRs to predict whether a patient will have an adverse reaction to a medication. We find that our approach is more accurate than performing no clustering, pre-clustering, and using expert-constructed medical heterarchies.

artificial intelligence, latent structure, machine learning, (15 more...)

arXiv.org Artificial Intelligence

1206.6399

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
North America > United States > Wisconsin > Wood County > Marshfield (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback